{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "\n",
    "`Learn the Basics <intro.html>`_ ||\n",
    "`Quickstart <quickstart_tutorial.html>`_ ||\n",
    "`Tensors <tensorqs_tutorial.html>`_ ||\n",
    "`Datasets & DataLoaders <data_tutorial.html>`_ ||\n",
    "`Transforms <transforms_tutorial.html>`_ ||\n",
    "**Build Model** ||\n",
    "`Autograd <autogradqs_tutorial.html>`_ ||\n",
    "`Optimization <optimization_tutorial.html>`_ ||\n",
    "`Save & Load Model <saveloadrun_tutorial.html>`_\n",
    "\n",
    "Build the Neural Network\n",
    "===================\n",
    "\n",
    "Neural networks comprise of layers/modules that perform operations on data.\n",
    "The `torch.nn <https://pytorch.org/docs/stable/nn.html>`_ namespace provides all the building blocks you need to\n",
    "build your own neural network. Every module in PyTorch subclasses the `nn.Module <https://pytorch.org/docs/stable/generated/torch.nn.Module.html>`_.\n",
    "A neural network is a module itself that consists of other modules (layers). This nested structure allows for\n",
    "building and managing complex architectures easily.\n",
    "\n",
    "In the following sections, we'll build a neural network to classify images in the FashionMNIST dataset.\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "import os\r\n",
    "import torch\r\n",
    "from torch import nn\r\n",
    "from torch.utils.data import DataLoader\r\n",
    "from torchvision import datasets, transforms"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Get Device for Training\n",
    "-----------------------\n",
    "We want to be able to train our model on a hardware accelerator like the GPU,\n",
    "if it is available. Let's check to see if\n",
    "`torch.cuda <https://pytorch.org/docs/stable/notes/cuda.html>`_ is available, else we\n",
    "continue to use the CPU.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Using cpu device\n"
     ]
    }
   ],
   "source": [
    "device = 'cuda' if torch.cuda.is_available() else 'cpu'\r\n",
    "print('Using {} device'.format(device))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Define the Class\n",
    "-------------------------\n",
    "We define our neural network by subclassing ``nn.Module``, and\n",
    "initialize the neural network layers in ``__init__``. Every ``nn.Module`` subclass implements\n",
    "the operations on input data in the ``forward`` method.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "class NeuralNetwork(nn.Module):\r\n",
    "    def __init__(self):\r\n",
    "        super(NeuralNetwork, self).__init__()\r\n",
    "        self.flatten = nn.Flatten()\r\n",
    "        self.linear_relu_stack = nn.Sequential(\r\n",
    "            nn.Linear(28*28, 512),\r\n",
    "            nn.ReLU(),\r\n",
    "            nn.Linear(512, 512),\r\n",
    "            nn.ReLU(),\r\n",
    "            nn.Linear(512, 10),\r\n",
    "        )\r\n",
    "\r\n",
    "    def forward(self, x):\r\n",
    "        x = self.flatten(x)\r\n",
    "        logits = self.linear_relu_stack(x)\r\n",
    "        return logits"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We create an instance of ``NeuralNetwork``, and move it to the ``device``, and print\n",
    "its structure.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NeuralNetwork(\n",
      "  (flatten): Flatten(start_dim=1, end_dim=-1)\n",
      "  (linear_relu_stack): Sequential(\n",
      "    (0): Linear(in_features=784, out_features=512, bias=True)\n",
      "    (1): ReLU()\n",
      "    (2): Linear(in_features=512, out_features=512, bias=True)\n",
      "    (3): ReLU()\n",
      "    (4): Linear(in_features=512, out_features=10, bias=True)\n",
      "  )\n",
      ")\n"
     ]
    }
   ],
   "source": [
    "model = NeuralNetwork().to(device)\r\n",
    "print(model)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To use the model, we pass it the input data. This executes the model's ``forward``,\n",
    "along with some `background operations <https://github.com/pytorch/pytorch/blob/270111b7b611d174967ed204776985cefca9c144/torch/nn/modules/module.py#L866>`_.\n",
    "Do not call ``model.forward()`` directly!\n",
    "\n",
    "Calling the model on the input returns a 10-dimensional tensor with raw predicted values for each class.\n",
    "We get the prediction probabilities by passing it through an instance of the ``nn.Softmax`` module.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Predicted class: tensor([1])\n"
     ]
    }
   ],
   "source": [
    "X = torch.rand(1, 28, 28, device=device)\r\n",
    "logits = model(X)\r\n",
    "pred_probab = nn.Softmax(dim=1)(logits)\r\n",
    "y_pred = pred_probab.argmax(1)\r\n",
    "print(f\"Predicted class: {y_pred}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "--------------\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Model Layers\n",
    "-------------------------\n",
    "\n",
    "Let's break down the layers in the FashionMNIST model. To illustrate it, we\n",
    "will take a sample minibatch of 3 images of size 28x28 and see what happens to it as\n",
    "we pass it through the network.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([3, 28, 28])\n"
     ]
    }
   ],
   "source": [
    "input_image = torch.rand(3,28,28)\r\n",
    "print(input_image.size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "nn.Flatten\n",
    "^^^^^^^^^^^^^^^^^^^^^^\n",
    "We initialize the `nn.Flatten  <https://pytorch.org/docs/stable/generated/torch.nn.Flatten.html>`_\n",
    "layer to convert each 2D 28x28 image into a contiguous array of 784 pixel values (\n",
    "the minibatch dimension (at dim=0) is maintained).\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([3, 784])\n"
     ]
    }
   ],
   "source": [
    "flatten = nn.Flatten()\r\n",
    "flat_image = flatten(input_image)\r\n",
    "print(flat_image.size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "nn.Linear\n",
    "^^^^^^^^^^^^^^^^^^^^^^\n",
    "The `linear layer <https://pytorch.org/docs/stable/generated/torch.nn.Linear.html>`_\n",
    "is a module that applies a linear transformation on the input using its stored weights and biases.\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "torch.Size([3, 20])\n"
     ]
    }
   ],
   "source": [
    "layer1 = nn.Linear(in_features=28*28, out_features=20)\r\n",
    "hidden1 = layer1(flat_image)\r\n",
    "print(hidden1.size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "nn.ReLU\n",
    "^^^^^^^^^^^^^^^^^^^^^^\n",
    "Non-linear activations are what create the complex mappings between the model's inputs and outputs.\n",
    "They are applied after linear transformations to introduce *nonlinearity*, helping neural networks\n",
    "learn a wide variety of phenomena.\n",
    "\n",
    "In this model, we use `nn.ReLU <https://pytorch.org/docs/stable/generated/torch.nn.ReLU.html>`_ between our\n",
    "linear layers, but there's other activations to introduce non-linearity in your model.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Before ReLU: tensor([[-0.1295,  0.0663, -0.0645, -0.5348,  0.1539,  0.3124,  0.2856, -0.3842,\n",
      "          0.0126, -0.5454,  0.2677, -0.1842, -0.5987, -0.0048, -0.2771,  0.4133,\n",
      "          0.3858, -0.2181,  0.1678, -0.4557],\n",
      "        [-0.3826,  0.4261,  0.1206, -0.3975, -0.0019,  0.5272,  0.1900, -0.3786,\n",
      "          0.0234, -0.4552,  0.4658, -0.1285, -0.2334,  0.2906, -0.2341,  0.2080,\n",
      "          0.3831, -0.0781, -0.1343, -0.5645],\n",
      "        [ 0.2347,  0.0894,  0.1233, -0.6010,  0.3294,  0.2070,  0.2580, -0.1984,\n",
      "          0.1275, -0.5655,  0.2555,  0.0117, -0.1809,  0.2718, -0.1680,  0.3696,\n",
      "         -0.0438, -0.0999,  0.1566, -0.1699]], grad_fn=<AddmmBackward>)\n",
      "\n",
      "\n",
      "After ReLU: tensor([[0.0000, 0.0663, 0.0000, 0.0000, 0.1539, 0.3124, 0.2856, 0.0000, 0.0126,\n",
      "         0.0000, 0.2677, 0.0000, 0.0000, 0.0000, 0.0000, 0.4133, 0.3858, 0.0000,\n",
      "         0.1678, 0.0000],\n",
      "        [0.0000, 0.4261, 0.1206, 0.0000, 0.0000, 0.5272, 0.1900, 0.0000, 0.0234,\n",
      "         0.0000, 0.4658, 0.0000, 0.0000, 0.2906, 0.0000, 0.2080, 0.3831, 0.0000,\n",
      "         0.0000, 0.0000],\n",
      "        [0.2347, 0.0894, 0.1233, 0.0000, 0.3294, 0.2070, 0.2580, 0.0000, 0.1275,\n",
      "         0.0000, 0.2555, 0.0117, 0.0000, 0.2718, 0.0000, 0.3696, 0.0000, 0.0000,\n",
      "         0.1566, 0.0000]], grad_fn=<ReluBackward0>)\n"
     ]
    }
   ],
   "source": [
    "print(f\"Before ReLU: {hidden1}\\n\\n\")\r\n",
    "hidden1 = nn.ReLU()(hidden1)\r\n",
    "print(f\"After ReLU: {hidden1}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "nn.Sequential\n",
    "^^^^^^^^^^^^^^^^^^^^^^\n",
    "`nn.Sequential <https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html>`_ is an ordered\n",
    "container of modules. The data is passed through all the modules in the same order as defined. You can use\n",
    "sequential containers to put together a quick network like ``seq_modules``.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "seq_modules = nn.Sequential(\r\n",
    "    flatten,\r\n",
    "    layer1,\r\n",
    "    nn.ReLU(),\r\n",
    "    nn.Linear(20, 10)\r\n",
    ")\r\n",
    "input_image = torch.rand(3,28,28)\r\n",
    "logits = seq_modules(input_image)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "nn.Softmax\n",
    "^^^^^^^^^^^^^^^^^^^^^^\n",
    "The last linear layer of the neural network returns `logits` - raw values in [-\\infty, \\infty] - which are passed to the\n",
    "`nn.Softmax <https://pytorch.org/docs/stable/generated/torch.nn.Softmax.html>`_ module. The logits are scaled to values\n",
    "[0, 1] representing the model's predicted probabilities for each class. ``dim`` parameter indicates the dimension along\n",
    "which the values must sum to 1.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [],
   "source": [
    "softmax = nn.Softmax(dim=1)\r\n",
    "pred_probab = softmax(logits)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Model Parameters\n",
    "-------------------------\n",
    "Many layers inside a neural network are *parameterized*, i.e. have associated weights\n",
    "and biases that are optimized during training. Subclassing ``nn.Module`` automatically\n",
    "tracks all fields defined inside your model object, and makes all parameters\n",
    "accessible using your model's ``parameters()`` or ``named_parameters()`` methods.\n",
    "\n",
    "In this example, we iterate over each parameter, and print its size and a preview of its values.\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false,
    "jupyter": {
     "outputs_hidden": false
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Model structure:  NeuralNetwork(\n",
      "  (flatten): Flatten(start_dim=1, end_dim=-1)\n",
      "  (linear_relu_stack): Sequential(\n",
      "    (0): Linear(in_features=784, out_features=512, bias=True)\n",
      "    (1): ReLU()\n",
      "    (2): Linear(in_features=512, out_features=512, bias=True)\n",
      "    (3): ReLU()\n",
      "    (4): Linear(in_features=512, out_features=10, bias=True)\n",
      "  )\n",
      ") \n",
      "\n",
      "\n",
      "Layer: linear_relu_stack.0.weight | Size: torch.Size([512, 784]) | Values : tensor([[-0.0189,  0.0294, -0.0256,  ..., -0.0298, -0.0196,  0.0139],\n",
      "        [ 0.0254,  0.0210, -0.0105,  ..., -0.0308, -0.0055,  0.0352]],\n",
      "       grad_fn=<SliceBackward>) \n",
      "\n",
      "Layer: linear_relu_stack.0.bias | Size: torch.Size([512]) | Values : tensor([-0.0211,  0.0354], grad_fn=<SliceBackward>) \n",
      "\n",
      "Layer: linear_relu_stack.2.weight | Size: torch.Size([512, 512]) | Values : tensor([[-0.0286, -0.0089,  0.0331,  ..., -0.0046, -0.0430, -0.0414],\n",
      "        [ 0.0203,  0.0220, -0.0051,  ...,  0.0293, -0.0257, -0.0089]],\n",
      "       grad_fn=<SliceBackward>) \n",
      "\n",
      "Layer: linear_relu_stack.2.bias | Size: torch.Size([512]) | Values : tensor([ 0.0252, -0.0344], grad_fn=<SliceBackward>) \n",
      "\n",
      "Layer: linear_relu_stack.4.weight | Size: torch.Size([10, 512]) | Values : tensor([[ 0.0389,  0.0216, -0.0089,  ..., -0.0077,  0.0306, -0.0440],\n",
      "        [ 0.0135,  0.0139,  0.0314,  ...,  0.0312,  0.0409,  0.0247]],\n",
      "       grad_fn=<SliceBackward>) \n",
      "\n",
      "Layer: linear_relu_stack.4.bias | Size: torch.Size([10]) | Values : tensor([-0.0389,  0.0327], grad_fn=<SliceBackward>) \n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(\"Model structure: \", model, \"\\n\\n\")\r\n",
    "\r\n",
    "for name, param in model.named_parameters():\r\n",
    "    print(f\"Layer: {name} | Size: {param.size()} | Values : {param[:2]} \\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "--------------\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Further Reading\n",
    "--------------\n",
    "- `torch.nn API <https://pytorch.org/docs/stable/nn.html>`_\n",
    "\n"
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "1baa965d5efe3ac65b79dfc60c0d706280b1da80fedb7760faf2759126c4f253"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
